feat: add helm/michelangelo Helm chart for control plane deployment#1143
Open
Summary:

Intent:
- Convert the Michelangelo control plane from raw kubectl apply calls in sandbox.py to a first-class Helm chart installable against any Kubernetes cluster (closes #1136)
- Enable helm install for local k3d development and standard --set overrides for production and staging environments

Changes:
- Add helm/michelangelo/ chart with Chart.yaml, values.yaml, values-k3d.yaml, and 20 templates covering all 5 control plane services (apiserver, envoy, ui, worker, controllermgr) promoted to Deployments
- Add schema init containers on apiserver (wait-for-metadata-storage + schema-init), eliminating the ordering race condition in sandbox.py
- Replace boot.yaml cluster-admin with a least-privilege ClusterRole scoped to what controllermgr and apiserver crdSync actually need
- Rename minio-credentials to object-storage-credentials in chart templates
- Add per-service enabled toggles, Cadence/Temporal engine guards with fail-fast validation, and a helm test hook

Test Plan:
- helm lint → 0 errors, 0 failures
- helm template with no values → fails fast with clear required-value error
- helm template -f values-k3d.yaml → 21 resources render clean
- helm template with full production values (Temporal) → 21 resources render clean
- helm install -f values-k3d.yaml against live k3d → all 5 pods Running
- helm test michelangelo → Phase: Succeeded
- helm upgrade --reuse-values → zero pod restarts
- helm uninstall → credential Secrets survive (resource-policy: keep)

Revert Plan:
- Revert this PR via git revert. The helm/ directory is additive and sandbox.py is unchanged, so no production behavior is affected.

Closes #1136
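The fail-fast validation mentioned under Changes could look roughly like the following Helm helper. This is a minimal sketch, not the template from this PR: the helper name `michelangelo.validateWorkflow`, its file location, and the lowercase engine names `cadence` and `temporal` are assumptions. Only the value keys `workflow.endpoint` and `workflow.engine` and the error text "workflow.endpoint is required" come from the test plan. It also assumes values.yaml defines a `workflow` map, since a missing parent key would raise a nil-pointer error before `required` runs.

```yaml
{{/*
Hypothetical helper (e.g. templates/_validate.tpl); not taken from the PR.
Aborts rendering when workflow settings are missing or unsupported.
*/}}
{{- define "michelangelo.validateWorkflow" -}}
{{- /* required stops rendering with this message when the value is empty */ -}}
{{- $_ := required "workflow.endpoint is required" .Values.workflow.endpoint -}}
{{- /* assumed engine names; the chart's accepted values may differ */ -}}
{{- $engine := .Values.workflow.engine | default "" -}}
{{- if not (has $engine (list "cadence" "temporal")) -}}
{{- fail (printf "workflow.engine must be cadence or temporal, got %q" $engine) -}}
{{- end -}}
{{- end -}}
```

A template would opt in with `{{ include "michelangelo.validateWorkflow" . }}` near its top, so that helm template and helm install both stop before any resource is rendered.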
What type of PR is this? (check all applicable)
What changed?
Adds helm/michelangelo/, a first-class Helm chart that installs the full Michelangelo control plane (apiserver, envoy, ui, worker, controllermgr) into any Kubernetes cluster with a single helm install command. All 5 services are promoted from bare Pods to Deployments. Includes least-privilege RBAC, schema init containers, per-service enabled toggles, Cadence/Temporal engine guards, credential Secrets with resource-policy: keep, and a helm test hook.
Why?
The control plane was previously deployed by sandbox.py via sequential kubectl apply calls with hardcoded addresses and no self-healing. A Helm chart enables a standard install/upgrade/uninstall lifecycle, works against any cluster (local k3d, staging, production), and is required for open-source users who don't use the ma CLI. Closes #1136.
How did you test it?
- helm lint helm/michelangelo → 0 errors, 0 failures
- helm template with no values → fails fast with "workflow.endpoint is required"
- helm template -f values-k3d.yaml → 21 resources render clean
- helm template with full production values (Temporal engine) → 21 resources render clean
- helm template --set workflow.engine=invalid → clear validation error
- helm install -f values-k3d.yaml against live k3d cluster → all 5 pods Running within 60s
- helm test michelangelo → Phase: Succeeded
- helm upgrade --reuse-values → zero pod restarts
- helm uninstall → credential Secrets survive (resource-policy: keep confirmed)
Potential risks
None. This PR only adds new files under helm/. No existing code is modified; sandbox.py continues to deploy the control plane via kubectl apply unchanged until Phase 4 integration.
Release notes
N/A — additive change, no migration required.
Documentation Changes
helm/michelangelo/README.md documents prerequisites, install commands (k3d and production), full values reference, upgrade/uninstall instructions, and troubleshooting. No wiki changes needed.
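For readers who want a feel for the production install path before opening the README, an override file might look like the sketch below. The file name values-prod.yaml, the endpoint address, and the `<service>.enabled` key shape are assumptions for illustration. Only `workflow.engine`, `workflow.endpoint`, the five service names, and the existence of per-service enabled toggles come from this PR, so check the README's values reference for the real keys.

```yaml
# values-prod.yaml (hypothetical file name; not part of this PR)
workflow:
  engine: temporal                               # assumed spelling of the Temporal engine value
  endpoint: temporal-frontend.example.internal:7233  # placeholder address

# Per-service toggles; the "<service>.enabled" shape is an assumption.
apiserver:
  enabled: true
envoy:
  enabled: true
ui:
  enabled: false      # example: skip the UI in a headless environment
worker:
  enabled: true
controllermgr:
  enabled: true
```

Such a file would be passed with `helm install michelangelo helm/michelangelo -f values-prod.yaml`, and individual keys can still be overridden on the command line with --set.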